JUFIT: A Configurable Rule Engine for Filtering and Generating New Multilingual UMLS Terms
Abstract
We describe JuFiT, an easily adjustable rule engine that filters non-natural terms (i.e., terms that do not usually occur in running citation text) from the UMLS Metathesaurus and also adds new terms to the UMLS by rewriting non-natural ones. Unlike previous attempts (with MetaMap or Casper), JuFiT serves multilingual purposes: it runs not only on English but also on Spanish, French, German, and Dutch documents, the most prominent European languages in terms of UMLS coverage. We evaluated JuFiT under a variety of experimental conditions and found evidence that it increases annotation quality for English, and most likely also for German and Spanish.
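To make the filter-or-rewrite idea concrete, the following is a minimal sketch of a configurable rule pipeline. It is not JuFiT's implementation: the rule names (`drop_overlong`, `strip_nos`, `undo_inversion`), the token threshold, and the sample terms are all hypothetical, chosen only to illustrate how a rule can either drop a term or turn it into a more natural variant.

```python
import re
from typing import Callable, List, Optional

# A rule returns the (possibly rewritten) term, or None to filter it out.
# All rules below are illustrative assumptions, not JuFiT's actual rule set.
Rule = Callable[[str], Optional[str]]


def drop_overlong(term: str, max_tokens: int = 6) -> Optional[str]:
    """Filter terms with more tokens than an (assumed) threshold."""
    return term if len(term.split()) <= max_tokens else None


def strip_nos(term: str) -> Optional[str]:
    """Rewrite a term by removing a trailing 'NOS' qualifier."""
    return re.sub(r",?\s+NOS$", "", term)


def undo_inversion(term: str) -> Optional[str]:
    """Rewrite 'Head, Modifier' as 'Modifier Head' when there is one comma."""
    parts = term.split(", ")
    if len(parts) == 2:
        return f"{parts[1]} {parts[0]}"
    return term


def apply_rules(terms: List[str], rules: List[Rule]) -> List[str]:
    """Run every term through the rules in order; keep unique survivors."""
    kept: List[str] = []
    for term in terms:
        current: Optional[str] = term
        for rule in rules:
            current = rule(current)
            if current is None:  # a rule filtered the term out
                break
        if current is not None and current not in kept:
            kept.append(current)
    return kept


sample_terms = [
    "Failure, Renal",
    "Renal failure NOS",
    "Other specified disorders of the kidney and ureter in diseases classified elsewhere",
]
result = apply_rules(sample_terms, [drop_overlong, strip_nos, undo_inversion])
print(result)
```

Because the rules are passed in as a list, adjusting the configuration for another language would amount to swapping in a different list; whether JuFiT organizes its rules this way is not stated in the abstract.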
Journal title:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Volume 2015
Publication date 2015